Skip to content

Conversation

@MichaelHuth
Copy link
Collaborator

@MichaelHuth MichaelHuth commented Nov 11, 2025

@MichaelHuth
Copy link
Collaborator Author

@timjarsky Here is a first version to play with:

The new mode is called "sweepgroup".

For this mode avg requires as first argument an array of at least two selections.

group1 = select(selsweeps(0),selvis(all))
group2 = select(selsweeps(1),selvis(all))
group3 = select(selsweeps(0,1),selvis(all))

avg([$group1, $group2, $group3],sweepgroup)

What it does is the following:

  • the sweep data for each selection is retrieved
  • the sweep data of the nth entry of each selection is averaged

Selection to sweep data:

Sweep Sweep Sweep Sweep
Group 1 0 DA 0 AD - -
Group 2 1 DA 1 AD - -
Group 3 0 DA 0 AD 1 DA 1 AD

averaging now is done along the rows:

  • trace 1: avg(0 DA, 1 DA, 0 DA)
  • trace 2: avg(0 AD, 1 AD, 0 AD)
  • trace 3: avg(-, -, 1 DA)
  • trace 4: avg(-, -, 1 AD)

The non existing sweeps in a row are ignored.

The columns in the upper table are the sweep data that was retrieved for a given selection.
There are selection conditions that can result in no sweep data, e.g. when an epoch was chosen that does not exist in a specific sweep. So when doing that average one has to take care that the column to row relation in the table is actually correct.

@MichaelHuth MichaelHuth assigned timjarsky and unassigned MichaelHuth Nov 12, 2025
@timjarsky
Copy link
Collaborator

@t-b @MichaelHuth does this need to be limited to select operations as input? For example, I'd like to pass in apfrequency data (see sample SF data below).

selbase = select(selrange(E1),selchannels(AD0), selvis(all),selsweeps(), selstimset("*rheo*", "*supra*"))

selexpA = select(selexp("*.10A.*"), $selbase)
selexpB = select(selexp("*.10B.*"), $selbase)

datA = data($selexpA)
datB = data($selexpB)

apfrequency($datA)

and

apfrequency($datB)

and

avg(apfrequency($datA), apfrequency($datB), sweepgroup)

@timjarsky timjarsky assigned MichaelHuth and unassigned timjarsky Nov 14, 2025
@MichaelHuth
Copy link
Collaborator Author

MichaelHuth commented Nov 15, 2025

@timjarsky I made a change such that any dataset type elements for the first argument are accepted. For your sketched case the last line must be changed to avg([apfrequency($datA), apfrequency($datB)], sweepgroup) (note the enclosure in array brackets of the first argument of avg)

Now generally any input in the format
avg([<dataset>,<dataset>, ... ], sweepgroup) is accepted.

If the dataset is of type selection then automatically data is applied, such that e.g. avg([select(),select()], sweepgroup) -> avg([data(select()),data(select())], sweepgroup). This implicit conversion also applies for over and in mode.

It might also make sense to rename the new mode from sweepgroup to bygroup reflecting the more generic argument handling. (I am open for suggestions)

The logic as sketched with the selections and the table above also applies for the generic dataset input for matching what gets averaged.

@timjarsky
Copy link
Collaborator

timjarsky commented Nov 17, 2025

Hi @MichaelHuth ,

I tested avg([apfrequency($datA), apfrequency($datB)], sweepgroup) and it removes the x-data, that is, each average point is plotted against a single x value, even if vs. is provided.

The end goal is to input multiple FI curves from individual experiments, and generate an average FI response (multiple average frequency vs average current data points)

@timjarsky
Copy link
Collaborator

@MichaelHuth
Copy link
Collaborator Author

@timjarsky I made a change such that meta data like the sweep numbers is transferred from the first group in the array to the average result:

image

The distribution on the x-axis results from sweep number meta data present in the result wave. In your data the two incoming groups have different sweep numbers, so on a numerical basis it does not make sense to join the sweep numbers somehow together.

The only other option would be to create some text wave where the sweep numbers of the incoming groups are joined like e.g. 69_19 and let the plotter do a category plot. With more input groups the category labels would get rather long.

@timjarsky
Copy link
Collaborator

@MichaelHuth this question is tangental to this PR. I expect the following three code snippets to return the same thing, but one doesn't. Am I misunderstanding what the "intersection" of select operations means?

selrange("E1") is contained in a single select statement. plots the E1 epoch

fullSelect = select(selexp("*.10A.*"), $selbase, selchannels(AD0), selvis(all),selsweeps(), selstimset("*rheo*", "*supra*"), selrange("E1"))

data($fullSelect)

selrange("E1") is contained in the "base" select statement. Plots the entire stimulus set

selbase = select(selvis(all),selsweeps(), selstimset("*rheo*", "*supra*"), selrange("E1"))

selexpA = select(selexp("*.10A.*"), $selbase, selchannels(AD0))

data($selexpA)

selrange("E1") is in the combo select statement, plots the E1 epoch
avgMethodTesting.pxp.zip

selbaseTWO = select(selvis(all),selsweeps(), selstimset("*rheo*", "*supra*"), )

selexpTWO = select(selexp("*.10A.*"), $selbaseTWO, selchannels(AD0), selrange("E1"))

data($selexpTWO)

do selectrange() and selectchannels() need to be in the same select statment for selectrange() to work?

@MichaelHuth
Copy link
Collaborator Author

@MichaelHuth this question is tangental to this PR. I expect the following three code snippets to return the same thing, but one doesn't. Am I misunderstanding what the "intersection" of select operations means?

The range does not intersect. This is also in the select documentation:
The range specified through selrange is always taken from the topmost select.

The background here is that selrange can contain either epoch specifications with optional wildcards or time intervals. The actual resolve of the epoch to time intervals is done when data is applied (i.e. when the sweep data is retrieved). But let's say we would already do that part of data in the select statement and resolve the time intervals. Then the next issue comes up and this is the question how to exactly do the range intersection?

If there is an inner selrange("E*") that would resolve to time intervals [100, 200] and [300, 400] and then there is an outer selrange to intersect with e.g. [150, 350] and [380, 500] then I can intersect every range with every other:

[100, 200] with [150, 350] -> [150, 200]
[100, 200] with [380, 500] -> none
[300, 400] with [150, 350] -> [300, 350]
[300, 400] with [380, 500] -> [300, 380]

Is this really useful for sweep data?

Another approach would be to do intersection through epoch names by the names, if both sides of the intersection contain the same name. e.g. "E*" would possibly resolve to "E1", "E2", "E3". Intersected with "E2" results in "E2".

But what about "E2" with "ST"?

So in summary, intersecting ranges with selects currently not done. It is in principle possible, if there is a clear definition how this intersection should operate as it is not straight forward. It would require additional changes in SF for select because the actual ranges would need to be determined earlier (now: in data, then additionally: in each select). It would make select slower.

@timjarsky
Copy link
Collaborator

Hi @MichaelHuth , thank you for the clear explanation.

I define the intersection as the set of regions that overlap. So I would do the following:

[100, 200] with [150, 350] -> [150, 200]
[100, 200] with [380, 500] -> none
[300, 400] with [150, 350] -> [300, 350]
[300, 400] with [380, 500] -> [380, 400]

However, I can work with it as it is and would want to talk more before implementing anything.

Thanks again.

@timjarsky
Copy link
Collaborator

Hi @MichaelHuth, the latest changes may have broken averaging; I expect the bottom plot to contain the average of the to FI arrays in the top plot.

avgMethodTesting 2.pxp.zip

sel = select(selsweeps(), selstimset("*rheo*", "*supra*"), selvis(all))

selexpA = select(selexp("*.10A.*"), $sel, selchannels(AD0), selrange(E1))

selexpB = select(selexp("*.10B.*"), $sel, selchannels(AD0), selrange(E1))

selDAA = select(selexp("*.10A.*"), $sel, selchannels(DA0), selrange(E1))

selDAB = select(selexp("*.10B.*"), $sel, selchannels(DA0), selrange(E1))

freqA = apfrequency(data($selexpA))

currentA = max(data($selDAA))

freqB = apfrequency(data($selexpB))

currentB = max(data($selDAB))


$freqA

vs

$currentA

with

$freqB

vs 

$currentB

and

avg([$freqA, $freqB], sweepgroup)

vs

avg([$currentA, $currentB], sweepgroup)
image

@MichaelHuth
Copy link
Collaborator Author

I looked into it and the issue is actually not the averaging. What happens is that for Min and Max the wave scale is copied to the result. Thus, the result value from Max has an x dimension offset that equals the time when epoch E1 starts. Now the E1 from selDAA and selDAB start at different times. Thus, the Max results have a different x-offset. The averaging is x-offset aware and does not return a single value but four in this case (due to average creating common x points). So the result is something like [557, NaN, NaN, 647].

Then for plotting our heuristics kicks in because the plotter sees then something like [4] vs [557, NaN, NaN, 647], where the number of points for an X vs Y plot do not match.

I am not sure what the original use case was that it makes sense to transfer the x-offset when doing max to the result. Our tests do not verify that behavior.
I have removed the scale transfer and get:
image

The result looks good imho. The colors in the upper plot do not match. I added small numbers to the first two pairs as indicators:

image

@MichaelHuth MichaelHuth assigned timjarsky and unassigned MichaelHuth Nov 19, 2025
@timjarsky
Copy link
Collaborator

@MichaelHuth okay, the latest change is helpful:

image

Now I want to subtract the first value in $currentA from all of $currentA. I imagined the syntax like this: $currentA - $currentA[0]. Is there a way to do that?

@MichaelHuth
Copy link
Collaborator Author

MichaelHuth commented Nov 20, 2025

At the moment there is not. We have an operation dataset to join data to a dataset, but we have no operation to extract a single dataset from an expression that returns multiple datasets. Indexing through brackets is not supported by SF.

Adding indexing support with brackets to SF would be a big change because the parser/executor would need changes for that. A way with less effort is to introduce a new operation like
extract($data, <index>)

Such that you can write $currentA - extract($currentA, 0). The logic in minus extends the second term again temporarily to have the same number of datasets as $currentA (but with the value from the first).

I will add an operation for this, so you can progress.

@MichaelHuth
Copy link
Collaborator Author

@timjarsky I added the extract operation that allows to extract a single dataset from a multi dataset expression. Your substraction expression looks like: $currentA - extract($currentA, 0) then.

I noticed that there is historically no explicit discrimination between arrays and datasets in SF. Thus, for some constructs the executor joins datasets to an array, especially when entered directly (not as result of an operation).
This was not yet a big issue as we did not evaluated arrays of datasets.
So some expressions like avg([dataset(1), dataset(2)],sweepgroup) give an error as currently the executor converts [dataset(1),dataset(2)] -> [1,2]. [1,2] is seen as a single dataset and then the sweepgroup mode is missing the second expected dataset.

@timjarsky
Copy link
Collaborator

@MichaelHuth

Works!

sel = select(selsweeps(), selstimset("*rheo*", "*supra*"), selvis(all))

selexpA = select(selexp("*.10A.*"), $sel, selchannels(AD0), selrange(E1))

selexpB = select(selexp("*.10B.*"), $sel, selchannels(AD0), selrange(E1))

selDAA = select(selexp("*.10A.*"), $sel, selchannels(DA0), selrange(E1))

selDAB = select(selexp("*.10B.*"), $sel, selchannels(DA0), selrange(E1))

freqA = apfrequency(data($selexpA))

currentA = max(data($selDAA))
currentANorm = $currentA - extract($currentA, 0)

freqB = apfrequency(data($selexpB))

currentB = max(data($selDAB))
currentBNorm = $currentB - extract($currentB, 0)


$freqA

vs

$currentA - extract($currentA, 0)

with

$freqB 

vs 

$currentB - extract($currentB, 0)

with

avg([$freqA, $freqB], sweepgroup)

vs

avg([$currentANorm, $currentBNorm], sweepgroup)
image

Nitpick: the sweep numbering shown in the legend for the average trace reflects only one dataset, not both.

I can trigger a color selection assertion by wrapping the last two lines in a merge. I see you may have fixed this in a different PR, but I wanted to confirm.

..
merge(avg([$freqA, $freqB], sweepgroup))

vs

merge(avg([$currentANorm, $currentBNorm], sweepgroup))
################################
  !!! Assertion FAILED !!!
  Message: "Invalid color group"
  Please provide the following information if you contact the MIES developers:
  ################################
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Stacktrace:
  SF_button_sweepFormula_display(...)#L1667 [MIES_SweepFormula.ipf]
SF_FormulaPlotter(...)#L903 [MIES_SweepFormula.ipf]
SF_GetTraceColor(...)#L392 [MIES_SweepFormula.ipf]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Time: 2025-11-22T08:38:57-08:00
  Locked device: [- none -]
  Current sweep: [- none -]
  DAQ: [- none -]
  Testpulse: [- none -]
  Acquisition state: [- none -]
  Experiment: avgMethodTesting (Packed)
  Igor Pro version: 9.06B01 (56565)
  MIES version:
  
  ################################

I plan to create a follow-up issue to enhance APfrequency so that generating these plots is less tedious when using pipeline data acquired with analysis functions.

@timjarsky timjarsky assigned MichaelHuth and unassigned timjarsky Nov 22, 2025
@timjarsky timjarsky requested a review from Copilot November 22, 2025 16:49
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new "sweepgroup" mode for the SF average operation and introduces a new extract operation for extracting elements from datasets.

  • Adds SF_OP_AVG_SWEEPGROUPS mode to the avg operation, enabling averaging across sweep groups
  • Implements the extract operation to retrieve specific indices from datasets
  • Refactors helper functions to support select-type argument resolution and dataset array handling

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
UTF_SweepFormula_Operations.ipf Adds test coverage for the new extract operation
MIES_SweepFormula_PSX.ipf Removes unused opShort parameter from SFH_GetArgumentSelect calls
MIES_SweepFormula_Operations_TP.ipf Removes unused opShort parameter from SFH_GetArgumentSelect call
MIES_SweepFormula_Operations.ipf Implements new sweepgroup mode for avg operation, adds extract operation, and removes redundant scale operations
MIES_SweepFormula_Helpers.ipf Extends helper functions to support select resolution and dataset array handling
MIES_SweepFormula_Executor.ipf Adds case handler for new extract operation
MIES_SweepFormula.ipf Registers the new extract operation
MIES_Constants.ipf Defines constant for the new extract operation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@MichaelHuth
Copy link
Collaborator Author

Nitpick: the sweep numbering shown in the legend for the average trace reflects only one dataset, not both.

The background for this is the following:

The sweepgroup average mode takes any datasets as groups as input. Each dataset has its meta data, like e.g. sweep number. Until now we did not have arguments in operations where we can input datasets with different meta data. The new average mode allows that: avg([$ds1, $ds2],sweepgroup) where $ds1 might be datasets with sweep number as meta data and $ds2 might be datasets that just contain some arbitrary wave data with different meta data.

Now we have some logic where we look at the meta data of a result in the plotter to determine how to display it on an x-axis. One thing evaluated is, if the data is a single data point and the meta data contains a sweep number. Then the plot with the known colored dots is created (as seen above).

Now in the data from the example experiment data from sweep 19 is averaged with data from sweep 69, 20 with 70 and so on.
How to join these?

  1. Do not join, just use meta data of first group for avg result
  2. Generate arbitrary sweep numbers by just indexing the group, like 0, 1, 2, 3, 4, ...
  3. Combine sweep numbers by multiplying each subsequent groups sweep numbers by 100 -> 69019, 70020, 71021, ...
  4. Introduce virtual negative sweep numbers and reverse x-axis, like 0, -1, -2, -3, -4, ... (and tag result to contain virtual sweep numbers)
  5. Clear any meta data in the result -> all avg results in the example experiment get plotted at the same x
  6. .. probably more ways

And this sketches only possibilities if all groups have sweep numbers, what if one group has sweep numbers and the next has some different meta data property?

Currently I implemented point 1.

@t-b
Copy link
Collaborator

t-b commented Nov 26, 2025

I can trigger a color selection assertion by wrapping the last two lines in a merge. I see you may have fixed this in a different PR, but I wanted to confirm.

Yes I think this is fixed in main.

In default setting the function enforces that at the given position is a
select type expression or array of it. It either gives an assertion or
defaults to the standard select expression select().

When doNotEnforce is set then the function returns a null wave if any
condition is not met that the content of the argument is actually a select
type. This allows for a caller to attempt a resolution of the argument as
select type data (or array of it)  and handle it differently if the content is of a different
type.
…ion to data

If the first argument of avg is a selection or an array of selections it
is now automatically converted to the sweep data selected.
@MichaelHuth MichaelHuth force-pushed the feature/2565-sf_new_average_mode branch from c4d3f89 to f0e5dc8 Compare November 28, 2025 14:43
- a new mode was added "group"
- the user has to give an array of datasets with at least 2 elements
  as first argument
- the averaging is done over the first entries of the datasets, then
  over the second and so on

ds0 = dataset(1, 2, 3)
ds1 = dataset(4, 5, 6)

avg([$ds1, $ds2], group)

Then
1 is averaged with 4
2 is averaged with 5
3 is averaged with 6

and the resulting dataset contains three elements with single point waves
equal to

dsResult = dataset(2.5, 3.5, 4.5)
…s result

The API is: extract(multi_dataset, index)

extract returns a single dataset.

Added test
@MichaelHuth MichaelHuth force-pushed the feature/2565-sf_new_average_mode branch from f0e5dc8 to 32a7878 Compare November 28, 2025 15:02
@MichaelHuth MichaelHuth assigned t-b and unassigned MichaelHuth Nov 28, 2025
@MichaelHuth
Copy link
Collaborator Author

I renamed the mode to just group.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SF: Add new mode to average

4 participants